ABSTRACT

Traditional photometric redshift methods use only color information about
the objects in question to estimate their redshifts. This paper introduces a new
method utilizing colors, luminosity, surface brightness, and radial light profile to
measure the redshifts of galaxies in the Sloan Digital Sky Survey (SDSS). We
take a statistical approach: distributions of galaxies from the SDSS Large-Scale
Structure (LSS; spectroscopic) sample are constructed at a range of redshifts,
and target galaxies are compared to these distributions. An adaptive mesh is
implemented to increase the percentage of the parameter space populated by
the LSS galaxies. We test the method on a subset of galaxies from the LSS
sample, yielding rms Δz of 0.025 for red galaxies and 0.030 for blue galaxies (all
with z < 0.25). Possible future improvements to this promising technique are
described, as is our ongoing work to extend the method to galaxies at higher
redshift.

Subject headings: galaxies: distances and redshifts; techniques: photometric;
catalogs



1. INTRODUCTION



Since Hubble (1929) discovered a linear relationship between the distances and redshifts
of other galaxies, redshift measurements have been the primary method for determining
distances to extragalactic objects. This is normally done using spectra of sufficiently high
resolution that individual spectral lines can be resolved and matched to the same features
in nearby objects, or by matching the spectrum to a model.

However, measuring the spectrum of an object with high spectral resolution and
sufficiently high signal-to-noise requires a significantly longer integration time than recording
broadband photometry of comparable quality. Thus, it is desirable to be able to measure
an object's redshift from broadband photometry alone. Redshifts measured this way are
called photometric redshifts, or photo-z's. Throughout this paper, we will refer to objects
for which photo-z's are sought as targets.



Photo-z techniques date back to Baum (1962), who combined nine photometric bands
to form low-resolution SEDs for elliptical galaxies. These traced the steep 4000 Å break
feature, which remains an excellent tool for photo-z determination since it produces a strong
difference in flux between whichever two passbands straddle it at a given redshift. Koo (1985)
was able to measure fairly accurate photo-z's for both red and blue galaxies using only 3 or
4 photometric passbands; his method involved comparisons of observed galaxy colors with
those predicted by the Bruzual spectral evolution models (Bruzual 1983, and companion
papers cited therein) at a range of redshifts. Connolly et al. (1995) took a purely empirical
(training set-based) approach, deriving a correlation between four-band photometric data
and the measured spectroscopic redshifts of a sample of galaxies. Sawicki, Lin & Yee (1997)
compared four-band target photometry to that predicted by empirical template spectra. More
recently, hybrid techniques combining spectral template-fitting with training sets have been
introduced (Budavári et al. 2000; Csabai et al. 2000, 2003).



All of the methods listed above use only the photometric fluxes (i.e. colors or apparent
magnitudes) of their targets for calculating photo-z's. However, galaxy images generally yield
additional geometrical information, such as angular size, shape, and light distribution (radial
and azimuthal). In a review of photometric redshift techniques, Koo (1999) suggested that
galaxy structural parameters (including surface brightness and radial light profile) could
be used to reduce the number of passbands needed for precise redshift estimates. Indeed,
the bulge-to-total flux ratio was used by Sarajedini et al. (1999) along with I-magnitude and
V - I color, and Kurtz et al. (2007) have recently developed a novel method that uses only
one color and the surface brightness from a single band.

Supervised neural networks have recently been used to compute photo-z's from a range of
input parameters, including Petrosian radii (Firth, Lahav & Somerville 2003; Vanzella et al.
2004), concentration index (Collister & Lahav 2004), surface brightness and axial ratios
(Ball et al. 2004). D'Abrusco et al. (2007) have incorporated Petrosian radii and information
about the radial profile into their neural network. Wadadekar (2005) has used a different
machine learning method to compute photo-z's based on five passband fluxes along with the
concentration index, while Way & Srivastava (2006) have used ensemble learning and Gaussian
process regression to derive photo-z's from colors and various morphological parameters.

This paper introduces a new, statistically-based photo-z technique, first conceived by
David Schlegel, that uses surface brightness and the Sersic index (a measure of the radial
light profile) in addition to five-band photometry. The method is empirical: the seven
properties listed are measured for a spectroscopic sample of galaxies, whose redshift information
is used to estimate photo-z's for the target galaxies.

Note that photometric redshifts have also been successfully applied to quasars (e.g.,
Richards et al. 2001; Budavári et al. 2001). This paper focuses on galaxy photo-z's.



The paper is structured as follows: in §2, we describe the spectroscopic sample of galaxies
used by the photo-z code. The photo-z technique and its development are discussed in §3,
along with other variations that were explored. A test of the photo-z code is described in
§4. We present our conclusions in §5, and suggest future improvements for increasing the
accuracy and applicability of the method.



2. THE SOURCE SAMPLE 



2.1. SDSS & the NYU-VAGC 



As of its Fourth Data Release (Adelman-McCarthy et al. 2006), the Sloan Digital Sky
Survey (SDSS; York et al. 2000; Gunn et al. 1998, 2006) has imaged roughly 7000 square
degrees of sky in five bands (u, g, r, i, z) ranging from the near-ultraviolet to the near-infrared
(Fukugita et al. 1996; Smith et al. 2002). Follow-up spectroscopy has been performed on
objects selected by one of several precisely defined target selection algorithms (Strauss et al.
2002; Eisenstein et al. 2001; Richards et al. 2002). SDSS has measured ~10^6 galaxy spectra,
but the number of galaxies detected in SDSS imaging is greater by roughly two orders of
magnitude. Thus, despite the great size of the SDSS spectroscopic sample, which includes
both a "Main" sample (flux-limited to r = 17.77) and a Luminous Red Galaxy (LRG) sample
(flux- and color-selected, reaching down to r = 19.5), the huge size of the imaging survey
makes it a very attractive target for photometric redshift techniques. We therefore work with
SDSS data, although the method is in principle applicable to any other imaging survey with
similar observable parameters.



The New York University Value-Added Galaxy Catalog (NYU-VAGC; Blanton et al.
2005) is essentially an "extended Main sample"; it extends the magnitude limit down to
r = 18, and makes the other cuts on the Main sample less restrictive. It also includes all
galaxies within 2 arcseconds of any target from the Main, LRG, or QSO samples, and thus
is useful for analyzing large-scale structure. In fact, also available are subsets of the
NYU-VAGC called Large-Scale Structure (LSS) samples, which contain only well-characterized
galaxies with measured spectroscopic redshifts. These samples are continually updated and
expanded; we use sample14, which contains 221,617 galaxies with good photometry.
Specifically, our sample results from an apparent magnitude cut, 14.5 < r < 17.5, an absolute
magnitude cut, -23 < M_r < -17, and a redshift cut, 0.01 < z < 0.25. The redshift cut
eliminates only a handful of galaxies that are not already eliminated by the photometric
cuts.

Finally, the NYU-VAGC also contains a few derived parameters, including K-corrections
and Sersic indices for all galaxies. The Sersic index n (Sersic 1968; Graham & Driver 2005)
is defined by fitting the radial surface brightness profile with a model of the form:

$$I(r) = A \exp\left[-(r/r_0)^{1/n}\right] \tag{1}$$

The value n = 1 produces an exponential light profile, typical of late-type galaxies (in
addition to some low-luminosity early-type galaxies), whereas n = 4 produces a "de Vaucouleurs
profile," long considered a good description for many early-type galaxies. The SDSS
photometric pipeline only performs fits for these two particular values, because computing an
arbitrary best-fit value is computationally very expensive (Stoughton et al. 2002). Thus,
Blanton et al. (2005) calculate this best-fit value of n themselves, for each galaxy in the
NYU-VAGC (though they do the fits to circularly averaged profiles, whereas the SDSS
pipeline performs a full 2-dimensional elliptical fit).
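
To make the role of n in Equation (1) concrete, the following minimal Python sketch evaluates the profile for the two special cases named above. It is an illustration only: the function name and the choice A = r_0 = 1 are assumptions made here, and it is not the fitting code used for the NYU-VAGC.

```python
import math

def sersic_profile(r, amplitude, r0, n):
    """Evaluate Equation (1): I(r) = A * exp(-(r / r0)**(1/n))."""
    return amplitude * math.exp(-((r / r0) ** (1.0 / n)))

# Compare n = 1 (exponential) with n = 4 (de Vaucouleurs) for A = r0 = 1.
# The n = 4 profile falls off far more slowly at large radius.
for radius in (0.0, 1.0, 16.0):
    exponential = sersic_profile(radius, 1.0, 1.0, 1)
    de_vaucouleurs = sersic_profile(radius, 1.0, 1.0, 4)
    print(radius, exponential, de_vaucouleurs)
```

Both profiles agree at r = 0 and at r = r_0; beyond r_0 the n = 4 profile retains much more light, which is why n discriminates between galaxy types.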



2.2. Examining the LSS samples 



Blanton et al. (2003a) used the slightly older LSS sample12, with cuts very similar to
the ones we used, to examine correlations among observable properties of SDSS galaxies. The
quantities they studied were the four colors u - g, g - r, r - i, i - z; the absolute magnitude M_i;
the surface brightness μ_i; and the Sersic index n, with all parameters "corrected" to z = 0.1.
That is, using each galaxy's redshift, its colors were K-corrected to the rest frame, but to
ugriz bandpasses shifted blueward by a factor (1 + 0.1) in λ. The absolute magnitude and
surface brightness are also for the (z = 0.1)-shifted i-band. Blanton et al. (2003a) produced
arrays (e.g. their Fig. 7) of two-dimensional galaxy distributions for each pair of the seven
properties listed, and discussed in depth the features of these bivariate distributions. The
plots along the diagonal of their Fig. 7 are one-dimensional distributions of each property.
We use sample14 to generate similar plot arrays at a range of redshifts (Figs. 1-4), but we
choose to use the apparent magnitude i, K-corrected and corrected for cosmological surface
brightness dimming, instead of a band-shifted M_i. Thus all properties plotted are photometric
observables. K-corrections are performed using the IDL code Kcorrect v3_2 (Blanton et al.
2003b). As in Blanton et al. (2003a), all magnitudes are Petrosian magnitudes (see
descriptions in Blanton et al. 2001; Strauss et al. 2002), which measure a fraction of the
galaxy light that is constant with distance or size (ignoring the effect of seeing);
Graham et al. (2005) have described a simple method for converting Petrosian magnitudes
to total magnitudes.



Note that Figs. 1-4, like Fig. 7 of Blanton et al. (2003a), attempt to show what a true
sample of galaxies at the indicated redshift looks like; this is achieved by weighting each
galaxy by 1/V_max, where V_max is "the volume covered by the survey in which this galaxy
could have been observed" (Blanton et al. 2003a). This weighting accounts for the window
function of the survey and the redshift distribution of the galaxies in the sample; §3.4 of
Blanton et al. (2003a) provides further details. As a result of this weighting, our 1-D
i-distributions have the form of Schechter functions, but with a sharp drop at the faint end
due to the apparent magnitude cut described above (the drop-off is not vertical because the
cut was performed in the r-band).

Comparing Figs. 1-4 reveals the changes in photometric properties that occur as the 
same sample of galaxies is observed at different redshifts. These changes are plotted directly 
in Figs. 5-6. Five randomly selected galaxies that appear faint and blue (at z = 0.1) and have 
exponential profiles are plotted at a range of redshifts (Fig. 5); the same is done separately 
for five randomly selected bright (L*), red, de Vaucouleurs galaxies (Fig. 6). The plots along 
the diagonal of each figure have redshift z increasing along the horizontal axis. 

By comparing Figs. 5-6 with Fig. 2, one sees that the Sersic index is a very useful
parameter for red galaxy photo-z's, since it is constant with redshift while all other properties
are not, and red galaxies exhibit a wide range in n. That is, the trajectory along which a red
galaxy moves in redshift (Fig. 6) is roughly perpendicular to the galaxy distribution in all
the 2-D plots containing n. The i-band apparent magnitude is also clearly a useful property
when combined with any of the other observables: it changes strongly with redshift, and
the red and blue galaxy trajectories never overlap in the 2-D plots. Note that there are
degeneracies in some of the color-color plots (i.e., high-z blue galaxies look like low-z red
galaxies), particularly those incorporating r-band data but not u-band data. However, the
other colors and the apparent magnitude clearly are sufficient to break the degeneracy.

3. THE PHOTO-Z CODE

3.1. Theory

We can determine a galaxy's redshift by combining its apparent (observable) properties
with absolute quantities, i.e. by specifying its type T. Thus, for a given galaxy targeted
for photo-z measurement, we want to find the peak of P(T), the probability distribution of
galaxy types that it could be. This information will allow us to compute its redshift.

The starting assumption of our photo-z technique is that the (shifted) empirical galaxy
distributions of §2.2 can be used as probability distributions. That is, we want to use the 7-D
distribution of the previously named observables (of which Figs. 1-4 show 2-D projections),
corrected to a given redshift z, to approximate P(T|z), the probability distribution of galaxy
types at that redshift. If the redshift corrections are reliable, then this should be a fairly
good approximation given the large sample size. According to Bayes' Theorem, a photo-z
can then be computed as the redshift that maximizes

$$P(T) = P(T|z) \, P(z), \tag{2}$$

where P(z) is the total probability distribution of redshifts for the sample of target galaxies.
Estimating this function well will be an important step in applying this photo-z method to
any new target sample.

The 7-D distributions are generated across a range of redshifts that is believed to cover
all galaxies in the target sample, with an interval between the redshifts that is less than the
rms error of the photo-z's. At each redshift, a target galaxy falls somewhere in the P(T|z)
distribution, and the value P(T|z) P(z) is computed and stored for comparison to values
at other redshifts.
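
The comparison across redshifts can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used in this work: the function name is hypothetical, and the inputs are assumed to be the values of P(T|z) already looked up for one target at each grid redshift, together with the corresponding P(z).

```python
def best_grid_redshift(z_grid, p_type_given_z, p_z):
    """Return (index, z, score) for the grid redshift maximizing P(T|z) * P(z).

    Returns None when every score is zero, i.e. the target never falls in an
    occupied cell and no redshift can be assigned.
    """
    scores = [pt * pz for pt, pz in zip(p_type_given_z, p_z)]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] <= 0.0:
        return None
    return best, z_grid[best], scores[best]

# Hypothetical numbers for a single target on a three-point redshift grid.
result = best_grid_redshift([0.02, 0.04, 0.06], [0.1, 0.5, 0.2], [1.0, 1.0, 4.0])
print(result)
```

In this toy case the prior P(z) moves the preferred redshift from the second grid point to the third, which is the kind of effect that makes a good estimate of P(z) important.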

Initially, a slightly different approach was considered: only one distribution would
be generated, and each target galaxy would be assigned many different redshifts in turn.
Roughly speaking, the best-fit photo-z would then be that which places the target at the
highest point in the distribution. However, Figs. 1-4 demonstrate that the distributions
change shape with redshift, so information would be lost with this approach. Furthermore,
for reasons described by Blanton et al. (2003b), "one can observe a galaxy at z = 0.1 and
reliably infer what it would look like at z = 0.3; it is only the reverse process that is
difficult." Since the median redshift of the LSS samples is z ~ 0.1, we are much better off doing
K-corrections to the sample galaxies than to a target galaxy that may have redshift z ~ 0.3.
Finally, the multiple-distribution method is more computationally efficient because we can
generate the requisite distributions just once and store them, so that no K-corrections need
be performed when we run the code on a set of targets. For all of these reasons, the method
of generating multiple distributions is favored.



3.2. Implementation 

We use IDL to implement the algorithm described above. Distance moduli (for shifting
the source galaxies) are computed using the cosmological parameters Ω_m = 0.3, Ω_Λ = 0.7,
and H_0 = 100 km/s/Mpc (following Blanton et al. 2003a). To avoid assigning as photo-z's
only those discrete redshifts at which the distributions are generated, we interpolate
quadratically between the maximizing redshift and its immediate neighbors at higher and
lower z. We assign the z-value corresponding to the peak of the fit parabola. For galaxies
assigned the minimum or maximum redshift tested, we simply use that value; however, the
redshift range can always be expanded so that there are few of these cases.
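
The quadratic interpolation step can be illustrated with the short Python sketch below. It assumes a uniformly spaced redshift grid and uses the standard three-point parabola vertex formula; the function name and the fallback for a non-peaked triple are choices made for this illustration, not details taken from the IDL implementation.

```python
def refine_peak(z_grid, scores, best):
    """Refine the maximizing grid redshift with a three-point parabola fit.

    z_grid is assumed uniformly spaced; scores[i] is P(T|z) * P(z) at z_grid[i];
    best is the index of the largest score.
    """
    # At either end of the grid there is only one neighbor: keep the grid value.
    if best == 0 or best == len(z_grid) - 1:
        return z_grid[best]
    y0, y1, y2 = scores[best - 1], scores[best], scores[best + 1]
    curvature = y0 - 2.0 * y1 + y2
    # A parabola through the three points has a maximum only if curvature < 0.
    if curvature >= 0.0:
        return z_grid[best]
    step = z_grid[best + 1] - z_grid[best]
    return z_grid[best] + 0.5 * step * (y0 - y2) / curvature

# Scores sampled from a parabola peaking at z = 0.105 recover that peak.
print(refine_peak([0.08, 0.10, 0.12], [0.9375, 0.9975, 0.9775], 1))
```

Because the vertex always lies within half a grid step of the maximizing grid point, the refined value never jumps past a neighboring grid redshift.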

The shifted galaxies are placed into cells in a 7-D array, each dimension of which spans a
range broad enough to include virtually every galaxy in the source sample, at every redshift
to be tested. Given this broad range, we must have a large number of cells in each dimension
in order to have reasonably high type-resolution. However, the resolution is limited by both
1) the amount of memory available on the system on which the code is run (this is a real
problem for 7-D arrays of numbers that can become fairly large near a peak in the distribution),
and 2) the fact that the number of points (source galaxies) that go in the array is fixed, so
that increasing resolution makes the array more and more sparsely populated.

We balance these competing factors by using a resolution of 15 cells per dimension.
However, for a typical distribution generated at this resolution (in particular, for z = 0.01),
only ~ 0.03% of the cells in the array are populated, and the majority of these contain just
one galaxy. Therefore an adaptive mesh is implemented, "smoothing" each single-galaxy cell
across all neighboring cells. Specifically, the occupation number of each cell is multiplied
by a (large) constant N, and then all cells that lie within one unit (in any combination of
dimensions) of a single-galaxy cell are populated with numbers, the total of which (for any
given single-galaxy cell) is N. Thus, after the initial multiplication by N, no points are
added to the distribution; it is merely smoothed around each cell that formerly contained a
single galaxy.
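
As a rough check on the scale of the array (simple arithmetic added here for illustration, not figures quoted from the code), a 7-D array with 15 cells per dimension and a populated fraction of 0.03% corresponds to:

```latex
15^{7} = 170{,}859{,}375 \ \text{cells}, \qquad
0.0003 \times 15^{7} \approx 5.1 \times 10^{4} \ \text{populated cells}, \qquad
3^{7} - 1 = 2186 \ \text{cells within one unit of any interior cell}.
```

The last number is the count of neighbors over which a single-galaxy cell away from the array boundary is smoothed.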

Furthermore, not all the cells newly populated by this step are given the same value, 
for they lie at different distances in parameter space from the central cell (the one that had 
only one galaxy). For example, a cell that has six coordinates in common with the central 
cell and only one that differs by unity is much "closer" than a cell with all seven coordinates 
differing by unity from those of the central cell. Thus we compute the center-to-center 
distance between each cell and the central one (in units of a cell), and place values in the
cells that are inversely proportional to that distance. The central cell gets the largest value 
of all, though this is greatly reduced from the value it had before smoothing. 
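
The smoothing can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the IDL implementation: the array is stored sparsely as a dictionary keyed by cell index, the total N is shared between the central cell and its neighbors, neighbors falling outside the array are dropped and the weights renormalized, and the central cell (at distance zero, where an inverse-distance weight is undefined) is given a hypothetical relative weight `center_weight` that must exceed 1 for it to receive the largest value.

```python
import itertools
import math

def smooth_single_cells(counts, n_cells, scale=1000.0, center_weight=2.0):
    """Spread each single-galaxy cell over its neighbors within one unit.

    counts maps cell-index tuples to occupation numbers; n_cells is the number
    of cells per dimension; scale plays the role of the constant N in the text.
    """
    if not counts:
        return {}
    ndim = len(next(iter(counts)))
    offsets = list(itertools.product((-1, 0, 1), repeat=ndim))
    smoothed = {}
    for cell, occupation in counts.items():
        if occupation != 1:
            # Cells holding several galaxies are only multiplied by N.
            smoothed[cell] = smoothed.get(cell, 0.0) + scale * occupation
            continue
        weights = {}
        for offset in offsets:
            target = tuple(c + o for c, o in zip(cell, offset))
            if any(t < 0 or t >= n_cells for t in target):
                continue  # assumption: ignore neighbors outside the array
            distance = math.sqrt(sum(o * o for o in offset))
            weights[target] = center_weight if distance == 0.0 else 1.0 / distance
        total = sum(weights.values())
        for target, weight in weights.items():
            smoothed[target] = smoothed.get(target, 0.0) + scale * weight / total
    return smoothed

# 2-D toy example: one single-galaxy cell and one cell holding five galaxies.
grid = smooth_single_cells({(1, 1): 1, (3, 3): 5}, n_cells=5)
print(len(grid), sum(grid.values()))
```

The function works in any number of dimensions; the 2-D example is used only to keep the output small. The sum over all cells equals N times the number of galaxies, so no points are added by the smoothing.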

After this smoothing is performed, the z = 0.01 distribution mentioned previously
populates ~ 3% of the array, an improvement by two orders of magnitude. In the next
section, we will see how this change affects photo-z measurements.

4. RESULTS 

Photometric redshift routines are usually tested by applying them to objects with known
(i.e., spectroscopic) redshifts. Since redshifts are known for all galaxies in LSS sample14, we
can simply trim the sample that we use to generate the distributions, and use the remaining
galaxies as the target sample. Specifically, we test the code on 1/4 of the sample (55,405
galaxies), using only the remaining 3/4 to generate the distributions. Distributions are
generated over the redshift range 0.02 < z < 0.30, at intervals of 0.02 in z (note that
the upper limit extends beyond the greatest redshift present in our source sample; still, we
include z = 0.30 in order to verify that no galaxies are incorrectly assigned such a high
redshift).

As explained in §3.1, an estimate of P(z) for the target sample is needed. In this special
case, P(z) is the same for both the source and target distributions. P(z) is usually "divided
out" from the source population when each galaxy is weighted by 1/V_max. Instead, in this
case we can avoid estimating P(z) entirely by giving each source galaxy a weight equal to
unity, effectively skipping the division by P(z). Then there is no need to multiply by P(z)
later, for the probability computed from the distribution at each given redshift z gives us
P(T) directly. Fig. 7 shows the 2-D projections of a unity-weighted distribution, as used in
this particular test.

We define Δz = z - z_phot, where z is the spectroscopic redshift and z_phot is our photo-z.
Without the adaptive mesh smoothing, this test yields an rms Δz of 0.029, with systematic
offset of essentially zero (mean Δz ~ -0.0005). However, our failure rate, i.e. the percentage
of galaxies that are not assigned a redshift because they do not fall inside an occupied cell
at any of the redshifts tested, is ~ 29%. With the smoothing incorporated, the failure rate
drops to ~ 11.3%, which should be acceptable for most purposes; the rms Δz also improves
slightly, to ~ 0.0275. Fig. 8 is a plot of z_phot vs. z for all the galaxies here tested.
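
The three summary statistics used here can be computed as in the Python sketch below. The function name and the convention of marking a failed target with `None` are assumptions made for this illustration; the rms is taken as the root mean square of Δz over the galaxies that were assigned a redshift.

```python
import math

def photoz_stats(z_spec, z_phot):
    """Return (rms of dz, mean dz, failure rate) with dz = z_spec - z_phot.

    Entries of z_phot equal to None mark targets that were not assigned a
    redshift; they count toward the failure rate but not toward rms or mean.
    """
    deltas = [zs - zp for zs, zp in zip(z_spec, z_phot) if zp is not None]
    failure_rate = 1.0 - len(deltas) / len(z_spec)
    if not deltas:
        return None, None, failure_rate
    mean = sum(deltas) / len(deltas)
    rms = math.sqrt(sum(d * d for d in deltas) / len(deltas))
    return rms, mean, failure_rate

# Hypothetical four-galaxy example with one failure.
print(photoz_stats([0.10, 0.20, 0.15, 0.05], [0.12, 0.18, None, 0.05]))
```

With this convention the numbers reported in the text are mutually consistent: a failure rate of ~ 11.3% on 55,405 targets leaves roughly 49,000 galaxies contributing to the rms.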



In addition, we examine the performance of the photo-z code on red and blue galaxies
separately, using the "optimal color separator" of Strateva et al. (2001), u - r = 2.22. The
target sample, thus divided, contains 25,296 "blue" galaxies and 30,109 "red" galaxies. The
rms Δz for the red galaxies is ~ 0.0246; for the blue galaxies, it is ~ 0.0303. Interestingly,
the red galaxies have a notably higher failure rate (~ 16.5%) than the blue galaxies (~ 5.1%).
Figs. 9 & 10 are plots of z_phot vs. z for the red and blue galaxy subsets, respectively. Table 1
divides the target sample even further, both by u - r color and by i-magnitude, and shows the
variation of rms Δz with these parameters. The errors are smaller for the brighter galaxies
of all colors, despite the fact that the fainter galaxies are more numerous in both the training
set and target sample.

Table 2 compares our photo-z accuracy to that achieved by other methods. Our rms
Δz is lower than that obtained by Csabai et al. (2003) using two template-fitting methods
and their own hybrid technique, and comparable to the results of the quadratic-fitting
approach of Connolly et al. (1995) and the support vector machine method of Wadadekar (2005).
The template-fitting methods also produce significant systematic offsets (underestimates),
while our method does not. Csabai et al. (2003) reported rms Δz of 0.029 for red galaxies
and 0.04 for blue galaxies, so our method shows the most pronounced improvement in the
photo-z's for blue galaxies. Csabai et al. (2003) used a smaller sample of ~ 35,000 galaxies,
but using smaller training sets does not significantly increase the errors from our method
(Mandelbaum et al., in preparation).



Padmanabhan et al. (2005) have achieved rms Δz ~ 0.03 using a template-fitting
approach, but they used the deeper SDSS LRG sample, so their results are not directly
comparable to ours.

As Table 2 shows, smaller rms Δz has been obtained using the neural network technique
of Collister & Lahav (2004) and two techniques (ensemble model and Gaussian process
regression) introduced by Way & Srivastava (2006). Other neural network methods have
similarly attained rms Δz ~ 0.02 (Vanzella et al. 2004; Ball et al. 2004; D'Abrusco et al. 2007).
However, our method is arguably more transparent than the neural network techniques. The
next section discusses additional improvements that could further reduce our errors in future
implementations.



5. CONCLUSIONS 

We have described a new method for determining photometric redshifts of SDSS
galaxies. The method is empirical, and uses a large spectroscopic sample of SDSS galaxies to infer
distributions of galaxy properties at a range of redshifts. The best-fit redshift is determined
by comparing these distributions to a galaxy for which a photo-z is desired. The properties
used are the five-band SDSS photometry, along with surface brightness and the Sersic index.
This represents one of the first alternatives to neural networks for deriving photo-z's from
imaging information beyond the photoelectric fluxes.

Our test of the method produces rms Δz = 0.025 for red galaxies in the Main sample,
and rms Δz = 0.030 for blue galaxies. These errors are an improvement over those
achieved by template-fitting and hybrid photo-z codes previously applied to SDSS galaxies,
but are somewhat worse than the errors typical of neural network methods.

Implementing an adaptive mesh reduces our method's failure rate, but has only a small
effect on the rms Δz, so further adjustments to the smoothing technique alone would not
likely reduce our errors. Similarly, training sets even larger than the 166,212 galaxies used in
our test are unlikely to improve the errors significantly (Mandelbaum et al., in preparation).
Because our errors are currently larger than the redshift spacing (0.02 in z) used in generating
the arrays for the test described here, generating the arrays at finer intervals does not by
itself reduce our errors.

One modification that may help would be to change the cell spacing for various
observables in the array; e.g., for the Sersic index, cells could be evenly spaced in log(n) rather than
evenly spaced in n. Alternatively, the spacing could be chosen (for any or all observables)
such that the peaks in the distribution are spread across many cells, effectively providing
higher resolution in P(T|z). This approach would have the added advantage of populating
a larger fraction of the array, potentially reducing the failure rate.

Looking ahead, the next major challenge for photometric redshift techniques (including 
our own) is to make them applicable to higher-redshift galaxy samples. At redshifts only a 
little higher than the maximum for our sample, the intrinsic evolution of the target galaxies 
becomes significant. This evolution can be calculated with some reasonable confidence for 
the red, passively evolving galaxies, but not for the actively star-forming blue ones. 

In any case, it is clear that to extend the present techniques to higher redshifts,
evolutionary corrections will have to be applied if one wishes to use the SDSS Main sample
to generate the 7-dimensional probability arrays. Of course, this approach will require one
to estimate the redshift distribution P(z) of the target sample in order to compute the
individual galaxy redshifts. Alternatively, deeper surveys covering the larger redshifts could
be used to generate a high-z training set, but the necessity to populate the arrays and
determine evolutionary effects self-consistently demands very large datasets. It is likely that
moderate-sized deep surveys can be used to verify empirical evolutionary corrections to the
SDSS Main sample for higher-redshift photo-z estimates, and this is the path now being
pursued here (Mandelbaum et al., in preparation). There are several redshift surveys deeper
than the SDSS spectroscopic sample that overlap with SDSS imaging, including the DEEP2
survey (e.g. Davis, Gerke & Newman 2005) and the CNOC2 survey (e.g. Lin et al. 1998),
which can be used in this endeavor, allowing us to probe more deeply the spatial distribution
of galaxies throughout the second half of cosmic history.

Table 1: Our Photo-z Errors as a Function of Color and Apparent Magnitude

                                          u - r
i             < 1.5     1.5-1.75  1.75-2.0  2.0-2.25  2.25-2.5  2.5-2.75  > 2.75
< 15.5        0.0130    0.0156    0.0184    0.0209    0.0181    0.0171    0.0177
15.5-16       0.0185    0.0212    0.0224    0.0247    0.0220    0.0200    0.0199
16-16.25      0.0226    0.0263    0.0250    0.0272    0.0255    0.0199    0.0216
16.25-16.5    0.0256    0.0270    0.0311    0.0298    0.0285    0.0229    0.0225
16.5-16.75    0.0272    0.0317    0.0335    0.0319    0.0274    0.0243    0.0239
16.75-17      0.0270    0.0319    0.0341    0.0363    0.0300    0.0251    0.0242
> 17          0.0317    0.0343    0.0363    0.0380    0.0334    0.0266    0.0265

Note. Entries are rms Δz. Most (color, magnitude) bins contain between 500 and 2000
galaxies; the least populated bin (u - r < 1.5, i < 15.5) contains 189 galaxies.



Table 2: Comparison of Photo-z Errors from Different Techniques

Method              rms Δz    Source
CWW templates       0.067     Csabai et al. (2003)
BC templates        0.055     Csabai et al. (2003)
Hybrid              0.035     Csabai et al. (2003)
Our Method          0.0275    This work
SVM                 0.027     Wadadekar (2005)
Quadratic fitting   0.026     Way & Srivastava (2006)
Gaussian Process    0.023     Way & Srivastava (2006)
ANNz                0.019     Way & Srivastava (2006)
Ensemble model      0.019     Way & Srivastava (2006)

Note. Photo-z errors of our method compared to those produced by other methods on similar
large catalogs of SDSS Main sample galaxies. The first two methods used the spectral templates of
Coleman, Wu & Weedman (1980) and Bruzual & Charlot (1983), respectively. The quadratic fitting method
is similar to that introduced by Connolly et al. (1995). The ANNz neural network code is that presented by
Collister & Lahav (2004).

Fig. 1. Properties of sample14 galaxies, K-corrected and corrected for cosmological surface
brightness dimming, to z = 0.05; μ_i is the i-band surface brightness, n is the Sersic index,
and i the i-band apparent magnitude. Galaxies are weighted by 1/V_max (explained in the
text, §2.2). Note that each 2-D plot is duplicated (reflected about the diagonal). The sharp
cutoff that appears in the distribution of i-magnitudes is due to the r < 17.5 cut imposed
on sample14.

Fig. 2. Same as Fig. 1, but for z = 0.1.

Fig. 3. Same as Fig. 1, but for z = 0.3.

Fig. 4. Same as Fig. 1, but for z = 0.5.

Fig. 5. Seven photometric properties of five randomly selected faint, blue, exponential
galaxies in the LSS sample14, plotted at a range of redshifts (specifically, at z =
0.05, 0.075, 0.1, 0.2, 0.3, 0.4, and 0.5). The galaxies were selected from a compact 7-D "box"
at z = 0.1.

Fig. 6. Same as Fig. 5, but for five randomly selected bright (L*), red, de Vaucouleurs
galaxies.

Fig. 7. Same as Fig. 2, but here each galaxy enters the distribution with weight 1, instead
of 1/V_max.

Fig. 8. Our photo-z vs. the spectroscopic z for all galaxies in the sample14 subset used
for testing, as described in §4 (49,158 galaxies). Rms Δz is ~ 0.0275.

Fig. 9. Same as Fig. 8, but with only the red galaxies (those with u - r > 2.22) plotted
(25,146 galaxies). Rms Δz is ~ 0.0246.




Fig. 10. Same as Fig. 8, but with only the blue galaxies (those with u - r < 2.22) plotted
(24,012 galaxies). Rms Δz is ~ 0.0303.



